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Abstract. A Bayesian approach to solar flare prediction has been developed, which uses 
only the event statistics of flares already observed. The method is simple, objective, and 
makes few ad hoc assumptions. It is argued that this approach should be used to pro- 
vide a baseline prediction for certain space weather purposes, upon which other meth- 
ods, incorporating additional information, can improve. A practical implementation of 
the method for whole-Sun prediction of Geostationary Observational Environment Satel- 
lite (GOES) events is described in detail, and is demonstrated for 4 November 2003, the 
day of the largest recorded GOES flare. A test of the method is described based on the 
historical record of GOES events (1975-2003), and a detailed comparison is made with 
US National Oceanic and Atmospheric Administration (NOAA) predictions for 1987-2003. 
Although the NOAA forecasts incorporate a variety of other information, the present 
method out-performs the NOAA method in predicting mean numbers of event days, for 
both M-X and X events. Skill scores and other measures show that the present method 
is slightly less accurate at predicting M-X events than the NOAA method, but substan- 
tially more accurate at predicting X events, which are important contributors to space 
weather. 
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1. Introduction 

Large solar flares are associated with a variety of space 
weather effects. Those effects which occur promptly moti- 
vate flare prediction, or forecasting. For example, soft X-ray 
enhancements due to large flares cause increased ionization 
of the upper atmosphere, which can cause short-wave ra- 
dio fadeouts, and Australia's Ionospheric Prediction Service 
(IPS) issues flare predictions on this basis. 1 Delayed space 
weather effects allow the possibility of physical modelling, 
given the knowledge that the flare has occurred. In this 
case prediction of the flare itself may be less important. 

Solar flare prediction remains in its infancy, because of a 
lack of detailed understanding of the physical processes un- 
derlying flares [e.g., Priest and Forbes, 2002]. Existing pre- 
diction methods are probabilistic. One popular approach re- 
lies on the Mcintosh optical classification of sunspots, which 
divides sunspot groups into 60 classes based on three pa- 
rameters [Mcintosh, 1990; Bornmann and Shaw, 1994]. The 
historical rate of flaring for a given classification provides 
an initial estimate for the expected flaring rate of flaring 
of an observed sunspot group. This approach is the basis 
for predictions published by the Space Environment Cen- 
ter of the US National Oceanic and Atmospheric Adminis- 
tration (NOAA) 2 as well as NASA [Gallagher, Moon and 
Wang, 2002] 3 and IPS. NOAA/SEC uses an 'expert system' 
developed by Mcintosh [1990], and adopted in 1987. The 
associated code begins with the Mcintosh classification but 
also incorporates additional information, including dynami- 
cal properties of spot growth, rotation and shear, magnetic 
topology inferred from sunspot structure, magnetic classi- 
fication, and previous (large) flare activity. The method 
involves more than 500 decision rules including 'rules of 
thumb' provided by human experts. 

A variety of properties of active regions are known to 
correlate with flare activity, and in principle could be incor- 
porated into predictions. Attention has focused on photo- 
spheric magnetic field measurements. For example, Sammis, 
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Tang and Zirin [2000] confirmed that most large flares occur 
in large, magnetically complex regions. Studies of vector 
magnetograms have suggested the length of the strongly- 
sheared strong field region along a neutral line as a predictor 
of flares and coronal mass ejections (CMEs) [e.g., Hagyard, 
1990; Falconer, 2001]. Rapid emergence of new magnetic 
flux is also often identified as being associated with flare oc- 
currence [e.g., Schmieder et al., 1994]. Recently Leka and 
Barnes [2003] examined the relationship between moments 
of quantities constructed from vector magnetic field maps 
and flaring. 

There are many problems with existing methods of 
flare prediction. One problem with classification-based ap- 
proaches is that they tend to ignore the variability in flaring 
rate within a class. A related difficulty is that choices for 
classes are ad hoc, and possibly subjective. For example, the 
Mcintosh classification is an arbitrary construction — other 
choices of the three parameters could be made, and different 
observers might disagree with a given classification. Similar 
criticisms apply to the inclusion of additional information, 
e.g. properties of an active region, in existing methods of 
prediction. The choice of properties is essentially arbitrary, 
and it is unclear how the information can be included in an 
objective way. 

The best indicator to future flaring activity is past flaring 
activity [e.g., Neidig, Wiborg and Seagraves, 1990]. (In the 
prediction literature, the tendency of an active region which 
has produced large flares in the past to produce large flares 
in the future is termed persistence.) Recently Wheatland 
[2004a] presented a Bayesian approach to solar flare predic- 
tion which uses the observed history of large and small flares 
together with the known phenomenological rules of flare oc- 
currence to make a prediction for flaring. This 'event statis- 
tics' approach has the advantage that it depends only on 
past flaring activity, and so avoids the ad hoc choices im- 
plicit in other prediction methods. The method has been 
developed into a practical automated prediction scheme for 
whole-Sun prediction of soft X-ray flares based on NOAA 
solar event lists for the Geostationary Observational Envi- 
ronmental Satellites (GOES). The method has been tested 
on the historical catalog of GOES events, and a brief account 
of this test was given in Wheatland [2004b] . 
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Because of its simplicity and relative objectivity, the 
event statistics method is well suited to providing a baseline 
forecast for whole-Sun flaring for space weather purposes. 
Other methods of prediction, incorporating additional in- 
formation, may then be applied to improve upon this base- 
line. In this paper the method and its implementation are 
described in detail. Section 2 reiterates the simple theory 
of the method [Wheatland, 2004a]. Section 3 describes the 
practical implementation for whole-Sun prediction of GOES 
events, and section 4 gives a detailed account of the results 
of the test of this implementation on historical GOES data, 
including comparison with NOAA predictions for 1987-2003. 
Section 5 presents a brief summary and discussion. 

2. Event statistics method 

It is well known that the size distribution of flares (i.e. the 
distribution of some measure of flare size, such as peak flux 
in soft X-rays) obeys a power-law distribution [e.g., Crosby, 
Aschwanden and Dennis, 1993]. For a given choice S of the 
measure of size the distribution may be written 



N(S) =x 1 (- / -i)sr 1 s- 



(1) 



where N(S)dS is the number of events per unit time with 
size in the range S to S + dS, the quantity Ai is the to- 
tal rate of events above the size Si , and 7 is the power-law 
index. The value of the power-law index depends on the 
measure of size S. For flare peak fluxes in X-ray bands, 7 
is generally found to be in the range 1.7 — 1.9 [e.g. Drake, 
1971; Hudson, Peterson and Schwartz, 1969; Hudson, 1991; 
Lee, Petrosian and McTiernan, 1995; Shimizu, 1995; Feld- 
man, Doschek and Klimchuk, 1997; Aschwanden, Dennis 
and Benz, 1998]. 

Typically we are interested in the rate of occurrence of 
large events. If we denote the large size of interest 52, then 
the corresponding rate may be denoted A 2 , and according 
to the distribution (1), this rate is given in terms of the rate 
Ai by 



A2 = A: 



(I)' 



(2) 



The power-law size distribution is one phenomenologi- 
cal rule of flare occurrence. A second rule is that on short 
timescales, flares appear to occur as a Poisson process in 
time. On longer timescales flare occurrence may be de- 
scribed as a time-dependent Poisson process [e.g., Biesecker, 
1994; Wheatland, 2001]. Assuming Poisson statistics, the 
probability of at least one large flare within a time AT is 



e = 1 - cxp(-A 2 AT). 



(3) 



Equations (2) and (3) provide a naive prediction. If the 
power-law index 7 is estimated, and the current rate Ai of 
small events is estimated, then e is the required probability. 
The Bayesian generalization involves calculating a posterior 
probability distribution P(e) for the unknown parameter e, 
given the available data and relevant background informa- 
tion [e.g., Box and Tiao, 1992]. Specifically we assume that 
the data are a sequence of M events with sizes si, S2, sm 
(where Si > Si for each i), observed during an observation 
interval from t = to t = T. The events are assumed to 
occur at times < ti < t-i < ... < tu < T. We also assume 
that an interval of time from t = T — T' to t — T has been 
identified during which the mean rate of events appears to 
be constant. A number M' of events is assumed to occur 
during this interval, which has duration T'. A practical so- 
lution to identifying this interval is provided by the Bayesian 
blocks procedure [Scargle, 1998] which is a Bayesian change- 
point algorithm designed to identify times of rate variation. 
This procedure is discussed in more detail below. As shown 



in Wheatland [2004a] , the power-law index may be approx- 
imated from the data by the maximum likelihood value 7* : 
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(4) 



and then the posterior distribution for e, based on 7*, M' 
and T' is 



P(e) = C 
x A 



ln(l-e) /S2\ 7 *~ r 



AT 



(I) 



(5) 



where A(Ai) is the prior distribution for Ai, and C is 
the normalization constant, determined by the requirement 
/ P(e)de =1. The prior distribution A(Ai) describes the 
values we would assign to Ai in the absence of any data. 
For a given prior distribution, the mean of the posterior dis- 
tribution provides an estimate for e, and the width of the 
distribution provides an estimate of the associated uncer- 
tainty. 

The approximation involved in using Equation (4) is valid 
provided the power-law index 7 is narrowly defined by the 
data, which is true for large M. For smaller numbers of 
events it is necessary to simultaneously infer 7 and e, and 
the details are in Wheatland [2004a]. For the applications 
in Sections 3 and 4, Equation (5) is found to be sufficiently 
accurate. 

3. Application to GOES events 

3.1. Implementation 

The method has been applied to event lists constructed 
from Geostationary Observational Environmental Satellite 
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Figure 1. Upper panel: probability density function 
(pdf) for the peak flux (1 - 8A) of GOES events 1975- 
2003, together with the threshold Si (vertical line) and 
the power- law model (thick line). Lower panel: cumula- 
tive distribution. 
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(GOES) observations at 1-8A. The relevant measure of size, 
S is the peak X-ray flux in the 1 — 8A channel, and it should 
be noted from the outset that this flux includes background. 
The GOES peak flux is routinely used to classify flares, with 
moderate-sized flares labelled M class (i.e. having a peak flux 
exceeding S2 = Sm = 10~ 5 Wm~ 2 ), and big flares labelled 
X class (peak flux exceeding S2 = Sx ~ 10~ 4 Wm~ 2 ). 
Hence we are interested in predicting M class and X class 
events. Predictions are made for the occurrence of at least 
one event within AT = I day of the prediction time. We 
take the time of events to correspond to the time of the 
peak flux recorded in the NOAA event lists. 

It is necessary to choose a threshold size Si above which 
event sizes are assumed to be power-law distributed. Fig- 
ure 1 shows the peak flux distribution of all GOES events 
for 1975-2003 from the historical event lists available from 
the NOAA National Geophysical Data Center. 4 The upper 
panel shows the data plotted as a probability density func- 
tion, or pdf (with bins of one tenth of a decade), and the 
lower panel shows the data plotted as a cumulative distri- 
bution (without binning). This figure shows that the data 
is power-law distributed for large peak fluxes. At low peak 
fluxes there is a departure from power-law behavior, which 
is due to the problem of identifying small events against 
the time varying soft X-ray background. A nominal thresh- 
old Si = 4 x 10" 6 Win" 2 for power-law behavior has been 
chosen, and is indicated by the vertical solid line. Also 
shown by a thick line is the power-law model distribution 
P(S) = (7* - 1)S7* _1 S" 7 *, with the the maximum likeli- 
hood index 7* given by Equation (4). For these data we 
find 7* « 2.15 ± 0.01. We note that this power law index is 
significantly larger than indices quoted in the literature for 
soft X-ray peak fluxes [e.g. Aschwanden, Dennis and Benz, 
1998], which are in the range 1.7-1.9. The reason is that 
the data used here is not background subtracted. For the 
prediction purposes it is preferable not to background sub- 
tract, because we wish to make a prediction about the flux 
including background. The method outlined here requires 
only that the quantity S is power-law distributed above the 
chosen threshold, which is confirmed by Figure 1. We will 
return to this point in Section 5. 

Predictions are made using data within a window of du- 
ration T, which is taken to span one year prior to the pre- 
diction time. Equation (4) is applied to the year of events to 
determine 7*. Then the Bayesian blocks procedure [Scargle, 
1998] is applied to the year of data. This procedure takes 
the sequence of event times ti,t2, ■■■,tM and returns a se- 
quence of changepoint times tso < isi < ■•■ < tBK at which 
the rate is determined to change (where tso and tsK are the 
start- and end-time of the data window), and a correspond- 
ing sequence A B i, A S2 , X B k of rates. The changepoints 
define K 'Bayesian blocks', i.e. intervals described by a single 
rate. The last Bayesian block provides T' and M' according 
to T = t BK - t B(K _i) and M' = XbkT'. 

The Bayesian blocks procedure involves Bayesian hypoth- 
esis testing, in which a single rate Poisson model is compared 
with a dual rate Poisson model for the data, for all possi- 
ble choices of changepoints coincident with event times. If 
the dual rate model is more likely (by a factor PRIOR, nom- 
inally taken to have the value two) the section of data is 
segmented, and the two segments are themselves subject to 
the test. This process is iterated. A segment is deemed com- 
plete when the single rate model is more likely, or when there 
is only one event in the segment. It should also be noted 
that the Bayesian blocks procedure requires event times in 
discrete timesteps, and to this purpose the peak times are 
rounded to the nearest minute. 

The rates in the Bayesian blocks before the last block 
contain information about how the flaring rate varies, and 
so are used to construct the prior A(Ai) in Equation (5). 
The model form 



was chosen for the prior, based on inspection of a rate distri- 
bution constructed from a Bayesian blocks decomposition of 
the GOES event lists for 1976-2003. For each prediction the 
parameters a, b, and c are determined, for the given one-year 
window of data, by requiring that the first three moments 
of the model match the first three moments of the data, es- 
timated from the Bayesian blocks results. Specifically we 
require 
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dAiA(Ai) = 1, 



cfAiAiA(Ai) = J2 Nm /J2 TBi > 

> 

dAiA?A(Ai) = Y, x BiTBi/Y, T Bii 



(7) 



where Nb% and T B i — Ib% — ts(i-i) denote the number of 
events in, and duration of, the i th Bayesian block respec- 
tively, where \si = Nsi/Tsi, and where the summations 
exclude the last block. As shown in the Appendix, these 
three conditions uniquely determine values of a, b, and c. 

Once 7*, M' and T' have been determined, and the prior 
has been constructed, Equation (5) is used (with S2 = Sm 
and £2 = Sx) to construct posterior distributions Pjyi(e) and 
Px(e) for the probability of occurrence of events above M 
size and above X size respectively. The means of these dis- 
tributions are taken as suitable estimates em and ex of the 
probabilities of at least one M class event (or larger), and at 
least one X class event within AT = 1 day: 



em 
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deeP M (e), ex 



deeP x (e) 



(8) 



Corresponding uncertainties ctm and ax may be obtained 
in the usual way from the first and second moments of the 
posterior distributions. 

Another quantity of interest is the probability of getting 
at least one flare of M class, i.e. a flare with peak flux greater 
or equal to 10~ 5 Wm~ 2 , but less than 10~ 4 Wm" 2 . To 
avoid ambiguity, we will refer to events with peak flux in 
this range as M-X flares. If e' describes the probability of 
at least one M class event, or larger, and e" describes the 
probability of at least one X class event, or larger, then the 
desired probability is e = e' — e". If we denote the corre- 
sponding posterior distribution Pvix(e) then we have 

PMx(e) = f de' f de"5(e-e' + e")P M (e')Px(e") 
Jo Jo 

de'P M (e')Px(e'-e). (9) 
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A(Ai) = aexp(— &Ai) 



(6) 



The estimate for this quantity is taken to be the mean of the 
posterior distribution, which is denoted emx, and similarly 
the associated uncertainty is denoted <tmx- 

3.2. Example: application to 4 November 2003 

To illustrate the application of the method, we consider 
a prediction for the time 00:00UT on 4 November 2003. 
The largest soft X-ray flare of the period of GOES obser- 
vations (1975-2004) occurred starting at 19:29UT on that 
day. The event saturated the GOES detectors, but was es- 
timated by NOAA to be X28, i.e. to have a peak flux of 
2.8 x 10~ 3 Wm~ 2 in the 1-8A band. The enhanced X-ray 
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Figure 2. Upper panel: pdf for peak flux of 1— 8A GOES 
events above threshold, for one year prior to 4 November 
2003. Lower panel: cumulative distribution. 

flux due to this flare caused a substantial lowering of the 
D-region of the ionosphere, suggesting an even higher peak 
flux of around X45 [Thomson, Rodger and Dowden, 2004]. 

Figures 2, 3 and 4 illustrate the prediction process for 
this day. 

Figure 2 shows the size distribution of the one-year win- 
dow of data (prior to 4 November 2003 00:00UT) used 
in the prediction. Only events above the threshold flux 
Si — 4 x 10~ 6 Wm -2 (which is indicated by a vertical line) 
are shown, and there are 480 events in all. The upper panel 
shows the pdf, and the lower panel the cumulative distribu- 
tion. The power-law model (with maximum likelihood index 
7* ~ 2.07±0.05) is indicated by a thick line. Once again we 
note that this value is significantly larger than other values 
for the distribution of soft X-ray peak flux quoted in the 
literature, because of the lack of subtraction of background 
flux. 

Figure 3 illustrates the inference on the rate. The upper 
panel shows the 480 events (indicated by crosses) above the 
threshold during the one year window, as a plot of peak flux 
versus time. The lower panel shows the result of applying 
the Bayesian blocks procedure to these data. The procedure 
decomposed the data into 13 blocks. The last block had a 
duration T' w 15.3 days and contained M' = 104 events. 

Figure 4 shows the posterior distributions i-W(e) (upper 
panel) and -Px(e) (lower panel). The estimates cmx and ex 
obtained from the means of the distributions are shown as 
vertical lines. The result is that the probability of at least 
one M-X event within one day is emx ~ 0.73 ± 0.03, and the 
probability of at least one event of X size within one day 
is ex ~ 0.19 ± 0.02. These values are quite high, reflecting 
the high rate of flaring immediately prior to 4 November. 
However, the results also show the limitations of probabilis- 
tic forecasting — the prediction for an X class event is only 
around 20%, yet the largest flare of the last three decades is 
imminent. 



Figure 3. Upper panel: Peak flux of 1 — 8A GOES events 
(crosses) above threshold versus time, for one year prior 
to 4 November 2003. Lower panel: Bayesian blocks de- 
composition of rate versus time. 

4. Test of the method 
4.1. Results 

To test the method, predictions were performed for each 
day of GOES event data for 1975-2003. Each prediction was 
compared with the historical fact of whether flares did or did 
not occur on the given day. A brief description of the test 
was given in Wheatland (2004b). Here the test is described 
in more detail, including comparison, in Section 4.2, of the 
prediction statistics with published NOAA predictions for 
1987-2002. 

Predictions were made according to the procedure out- 
lined in Section 3.1. Since the predictions use a one-year 
window of data, the results are only considered for 1976- 
2003, i.e. the predictions for 1975 are omitted from consid- 
eration because they are based on less than a year of previous 
data. 

Figure 5 summarizes the results, as plots of yearly num- 
bers of observed event days, i.e. days with one or more events 
(solid histograms), and yearly numbers of predicted event 
days (diamonds). The upper panel shows the results for 
M-X events, and the lower panel shows the results for X 
events. The numbers of predicted event days are the sums 
of the prediction estimates emx and ex over all days in a 
given year. Representative error bars are shown for the ob- 
servations, corresponding to the square root of the number 
of events. These error bars reflect the expected variability in 
the observed numbers. These plots suggest that the method 
does quite well in predicting overall event numbers, although 
some systematic over-prediction is apparent. 

Table 1 summarizes the statistics of the predictions for 
1976-2003, in a notation following meteorological practice 
[e.g., Murphy and Winkler, 1987]. To assess probabilis- 
tic predictions, the joint probability distribution for fore- 
casts (denoted /) and observations (denoted x) may be con- 
structed, and properties of this distribution examined. In 
the present context we have / = cmx or / = ex, for predic- 
tions for M-X or X events respectively. The value of x for 
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Figure 4. Upper panel: Posterior distribution for prob- 
ability of M-X events for 4 November 2003. Lower panel: 
Posterior distribution for X events for that day. 



each day is zero or one, according to whether an event did or 
did not occur. Averages over all days are denoted by (•••). 
For example, (/) is the average of the forecast probability 
over all days. As a second example, {f\x — 1) is the average 
of the forecast probability over all days on which at least 
one flare did occur. Standard deviations over all days are 
denoted by a. 'MAE' denotes the mean absolute error: 

UAE{f,x) = (\f-x\), (10) 

and 'MSE' denotes the mean square error: 

MSE(f,x) = ((f-xf). (11) 

The linear association is the correlation of / and x. 

Table 1 also gives the climatological skill score [e.g., Mur- 
phy and Epstein, 1989], defined by 

SS(/» = l-MSE(/,a;)/MSE((a;),a;) 

= l-MSE(f,x)/a 2 x , (12) 

which is a measure of the improvement of the forecasts over 
a constant forecast given by the average. Perfect prediction 
(/ — x) corresponds to a skill score of unity. A positive 
skill score indicates better performance, and a negative skill 
score worse performance, with respect to the average. 

The table indicates that the method performs quite well 
in describing the overall frequency of occurrence of flares: we 
have (smx) ~ 0.282, whereas M-X events occurred on a frac- 
tion 0.254 of days; and (ex) ~ 0.040, whereas X class events 
occurred on a fraction 0.036 of days. However, these values 
also confirm that there is a tendency for over-prediction. 

The method also has good discrimination, i.e. it assigns 
substantially higher values to / on days on which events 
occurred, compared with non-event days. The skill scores 
for the method are 0.272 (for M-X events) and 0.066 (for X 
events) . 



Another way to summarize the results is in terms of 'reli- 
ability plots,' which show the success of the predictions as a 
function of emx or ex- Figure 6 shows the reliability plot for 
M-X events. This diagram is constructed as follows. The 
predictions emx for all days are sorted into bins of width 
0.05. For each bin, the observed number of those days on 
which at least one event did occur is used to estimate the 
underlying probability of an event on those days, and this 
is the vertical value for the bin. Specifically, the estimate 
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Figure 5. Comparison of predictions and observa- 
tions for 1976-2003. Upper panel: Events days for M- 
X events: observed (histogram); predicted (diamonds). 
Lower panel: the same, but for X events. 
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Figure 6. Reliability plot for prediction of M-X events 
for 1976-2003. The horizontal axis shows the probabil- 
ities assigned in the predictions, and the vertical axis 
shows probabilities derived from the observed frequen- 
cies of event occurrence. 
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used is the Bayesian estimate assuming binomial statistics 
and a uniform prior: if there are R days with at least one 
event out of a total of S days, then the estimate for the 
probability is p = (R + 1)/(S + 2) (Laplace's rule of succes- 
sion), and the associated uncertainty is [p(l — p)/(5 + 3)] 1 ^ 2 
[e.g., Jaynes, 2003]. On a reliability plot, perfect prediction 
corresponds to a 45 degree line, which is indicated by the 
solid line in Figure 6. This figure shows that the method has 
performed quite well for prediction of M-X events. There is 
some over-prediction (the points fall below the perfect pre- 
diction line) for days on which the forecast has moderate 
values (eivix ~ 0.25 — 0.65). 




Forecast probability 



Figure 7. Reliability plot for prediction of X events for 
1976-2003 (see Figure 6). 
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Figure 7 shows the reliability plot for prediction of X 
events for 1976-2003. The predictions are conservative, in 
that ex is less than about 0.5 for all days. Once again the 
method appears to perform quite well, with some tendency 
to over-prediction. 

Although we have argued that the present prediction 
method involves relatively few ad hoc choices, there are a 
number of free parameters, in particular PRIOR, the prior ra- 
tio involved in the Bayesian blocks routine, Si , the threshold 
for the power-law model, T, the length of the data window 
used in the predictions, as well as the functional form (6) 
for the prior rate distribution. It is interesting to briefly ex- 
amine the effect on the predictions of varying some of these 
choices. Table 2 shows the effect on (emx) and (ex) of vary- 
ing these choices one by one. The observed means are given 
at the bottom. This table suggests that the method is rela- 
tively insensitive to the choices PRIOR and T. If the thresh- 
old Si is halved, the predictions are higher, and hence worse. 
This effect is probably due to the departure from power-law 
behavior at small sizes (see Figure 1) causing the inferred 
power-law index to be too small. Table 1 also indicates 
that if the prior for the rate is ignored [corresponding to the 
choices a = 1, b = 0, c = 1 in Equation (6)] the predictions 
are worse. This shows that the Bayesian blocks before the 
last are providing useful information for prediction. 

4.2. Comparison with NOAA predictions 

As described in Section 1, the NOAA uses the Mcintosh 
expert system [Mcintosh, 1990] to make flare predictions. 
The NOAA publishes web pages with tables of statistics de- 
scribing the reliability of its flare forecasts for the period 
1987-2003. 5 Using these tables, it is possible to compare the 
NOAA predictions with those of the present method. 

Table 3 compares the predictions of the two methods for 
1987-2003. The table lists the means of the forecast prob- 
abilities / = £mx and / = ex for the present method and 
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Figure 8. Comparison of observed event days (solid 
histograms), the predictions of the present method (di- 
amonds), and NOAA predictions (asterisks), for 1987- 
2003. Upper panel: M-X events. Lower panel: X events. 



Figure 9. Comparison of skill scores for the present 
method (diamonds), and for NOAA predictions (aster- 
isks), for 1987-2003. Upper panel: M-X events. Lower 
panel:X events. 
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for the NOAA method, as well as the means of the observed 
values x. The present method gives mean prediction proba- 
bilities closer to the observations for both M-X and X events. 
For M-X events, both methods show similar over-prediction, 
although the present method is slightly better. For X events 
the present method gives substantially improved mean pre- 
diction probabilities compared with the NOAA method. 

The average of / is an incomplete measure of the success 
of a prediction method, since it ignores e.g. whether high 
predictions are assigned on event days, and low predictions 
on non-event days. Hence Table 3 also lists the average fore- 
casts for event days and non-event days, the mean absolute 
error, the mean square error, and the skill scores, for the two 
methods. The values of {f\x = 1) and (f\x = 0) show that 
the NOAA method is somewhat more discriminating than 
the present method, for both M-X and X event prediction. 
However, the present method has a lower mean prediction 
for non-event days for X class flares. The mean absolute 
and mean square errors suggest that the NOAA method is 
slightly more accurate for M-X event prediction, but less ac- 
curate for X event prediction. The overall skill scores also 
support this. Notably the NOAA skill score for X event 
prediction is negative. 

It is also interesting to compare the predictions and obser- 
vations on a year by year basis. Figure 8 shows the predicted 
numbers of event days for the present method (diamonds), 
the predicted numbers of event days for the NOAA method 
(asterisks), as well as the observed number of event days 
(solid histograms), for each year in the period 1987-2003. 
The upper panel shows the results for M-X events, and the 
lower panel for X events. The predictions for M-X events 
for the two methods show a comparable scatter around the 
observed values. The predictions for X events are much bet- 
ter in the case of the present method, in particular for cycle 
23. 

Figure 9 compares the skill scores [Equation (12)] for the 
predictions by the two methods, on a year by year basis. The 
scores for the present method are shown by diamonds, and 
the scores for the NOAA method by asterisks, with the up- 
per panel showing the results for M-X events, and the lower 
panel X events. The NOAA method is seen by this measure 
to be slightly better at predicting M-X events but worse at 
predicting X events (and in particular had two years with 
very poor X event predictions). 

5. Discussion 

A practical implementation of an event statistics ap- 
proach to solar flare prediction [Wheatland 2004a] is demon- 
strated for daily whole-Sun prediction of GOES soft X-ray 
flares, and is illustrated by application to 4 November 2003, 
the day of the largest recorded GOES flare (Figures 2, 3 
and 4). The method makes predictions based only on the 
observed history of flaring by exploiting the phenomenolog- 
ical rules of flare occurrence, in particular the power-law 
distribution of flares in size (Figure 1). The method is sim- 
ple, both conceptually and in terms of implementation, and 
involves few ad hoc assumptions. It requires no special- 
ized observations, only the NOAA solar event lists. As such 
it is well-suited to providing a first guess for the probabil- 
ity of flare occurrence. In principle additional observations 
of physical parameters related to flaring could be used to 
improve upon the basic prediction provided by the present 
method, and approaches to this problem will be considered 
in future work. 

The method has been tested on prediction of GOES 
events for each day in the period 1976-2003. The test was 
briefly described in Wheatland [2004b], and a detailed ac- 
count is given here. The method is found to perform well 



in assigning probabilities to the occurrence of events in the 
range M to X ('M-X events') and to the occurrence of X 
events, although there is a tendency to over-prediction, in 
particular for M-X events. A number of measures of success 
of the method are examined, including predicted numbers 
of event days (Figure 5) reliability plots (Figures 6 and 7), 
and verification statistics (Table 1). In particular, the skill 
scores for M-X event prediction and X event prediction are 
found to be 0.272 and 0.066 respectively. 

The results of the test are found to only weakly depend 
on a number of chosen parameters in the method, in particu- 
lar a prior ratio for segmentation in the the Bayesian blocks 
procedure used to determine the current rate, and the length 
of the data window used (see Table 2). However, the pre- 
dictions require an accurate choice of the threshold for the 
power-law size distribution, and benefit from the inclusion 
of prior information on the rate. 

The present method is also compared with the long- 
standing NOAA prediction method, using the NOAA's pub- 
lished prediction statistics for 1987-2003 (Table 3 and Fig- 
ures 8 and 9) . The event statistics method is found to out- 
perform the NOAA method in predicting overall numbers 
of M-X and X event days. In particular the NOAA method 
is observed to seriously over-predict X class events (lower 
panel, Figure 8). The NOAA method is found to be slightly 
more accurate in M-X event prediction, but less accurate in 
X event prediction, based on skill scores and other valida- 
tion statistics (Table 3). Overall the present method pro- 
vides improved prediction of X class flares [e.g. compare the 
skill scores SS(/, x) in Table 3]. This is significant because 
X flares, although infrequent, are the most important flares 
from the point of view of space weather. It is perhaps sur- 
prising that the present method fares as well as it does, given 
the amount of background information incorporated into the 
NOAA forecasts [Mcintosh, 1990] . These results support the 
contention that the event statistics method is well suited to 
providing a baseline forecast, upon which other methods can 
improve. 

The tendency to over-predict moderate sized flares is still 
being investigated, but it is likely that it stems from the 
determination of the current rate. The Bayesian blocks pro- 
cedure is always trying to 'catch up' with variations in the 
Sun's flaring rate. If the Bayesian blocks method is system- 
atically late in detecting a sudden decline in rate (e.g. due 
to the decay of an active region, or its rotation off the disk), 
then there will be a period of over-prediction. A specific 
issue with the implementation of the Bayesian blocks pro- 
cedure is that there must be at least one event in a block, 
so that the inferred rate is never identically zero, even if the 
true rate is zero. This is expected to lead to overestimation 
of the rate at times of low activity. A related problem is 
that it is intrinsically difficult to accurately determine low 
rates because of the absence of events. In future work these 
questions will be examined more carefully. It should also be 
noted that the Bayesian blocks procedure is not guaranteed 
to find the optimal decomposition [Scargle, 1998] . Recently 
Scargle described a new, optimal Bayesian blocks algorithm 
[Scargle 2004] , and in future the new method will be applied 
to flare prediction. 

In Section 3 it was noted that the 1 - 8A GOES peak 
fluxes used here are not background subtracted. For many 
applications of soft X-ray data, e.g. determining intrinsic 
properties of flares, it is essential to perform accurate back- 
ground subtraction [e.g. Bornmann, 1990]. It is also impor- 
tant in statistical studies where the concern is with the dis- 
tribution of the intrinsic quantity, for example in studies in- 
vestigating what the distributions of peak flux reveal about 
underlying physical processes. However, in the present con- 
text background subtraction is unnecessary. As noted in 
Section 3, provided the peak fluxes (including background) 
are power-law distributed, they are suitable for predictive 
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purposes. This is an advantage of the method, in that read- 
ily available data (GOES event lists) may be used as the 
basis for a prediction. However, the lack of background sub- 
traction means that the peak flux distributions constructed 
here cannot be readily compared with other published dis- 
tributions. In particular, the power-law indices obtained 
here are typically larger than two, whereas a large body 
of literature reports that the intrinsic distribution of X-ray 
peak flux has an index in the range 1.7 — 1.9 [e.g. Drake, 
1971; Hudson, Peterson and Schwartz, 1969; Hudson, 1991; 
Lee, Petrosian and McTiernan, 1995; Shimizu, 1995; Feld- 
man, Doschek and Klimchuk, 1997; Aschwanden, Dennis 
and Benz, 1998]. 

Although convenient, the GOES event lists also have a 
specific shortcoming for predictive purposes. Events are 
selected against a substantial and time-varying soft X-ray 
background. At times of high solar activity, more events are 
missed because of the increased background [e.g., Wheat- 
land, 2001], and this leads to a relatively large threshold 
for power- law behavior of the peak fluxes (see Figure 1), 
which is a disadvantage for the present method. In future 
other datasets will be considered. However, the advantages 
of the GOES event lists are their availability, and the close 
relationship of soft X-ray peak flux to an important space 
weather effect (ionization of the upper atmosphere). 

Predictions made using the method described in this pa- 
per are now published daily on the web. 6 The web pages also 
include running measures of how accurate the published pre- 
dictions are, in the form of automatically updated plots of 
reliability and skill scores. The codes used to make the pre- 
dictions are written in the Interactive Data Language (IDL)- 
based SolarSoft system [Freeland and Handy, 1998], which 
has become the de facto standard for solar data analysis. 
All codes are available on request from the author. 

Appendix 

The moments of the model distribution (6) are given by 



/>oo 

(A") = a Arexp(-6A5)dAi 
Jo 

- 6 ( a + l)/c c i \ c J' 



(13) 



where F(x) is the Gamma function. Hence the three moment 
equations (7) may be written 

r(i/c) = i 



a 

b 2 ' c c 
a 

b 3 / c c 



r(2/c) = Ai 
r(3/c) = A?, 



(14) 



where Ai and Af denote the right hand sides of the second 
and third of equations (7). 

Eliminating a between the first and second of Equa- 
tions (14) gives 



6 _ 1/C r(2/ C ) 



r(i/ c ) 



Ai, 



(15) 



and eliminating a between the first and third of Equa- 
tions (14) gives 



-2/ 



c r(3/c) _-2 



r(i/c) 



Xf. 



(16) 



Eliminating b between Equations (15) and (16) gives the 
transcendental equation for c: 

[r(2/ c )f Af - (aT) 2 r(i/c)r(2/c) = o. (17) 



This equation needs to be solved for c, e.g. using Newton- 
Raphson, for a suitable initial guess, and then Equation (15) 
gives b: 



b= [r(i/c)Ai/r(2/c)]" 

and the first of Equations (14) gives a: 
a = b 1/c c/T{l/c). 



(18) 
(19) 
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Notes 

1. see http://www.ips.gov.au 

2. see http://www.sec.noaa.gov/ftpdir/latcst/dayprc.txt 

3. see http://beauty.nascom.nasa.gov/arm/latest/ 

4. see ftp://ftp.ngdc.noaa.gov 

5. see http://www.sec.noaa.gov/forecast_verification/ 

6. sec http://www.physics.usyd.edu.au/~wheat/prcdiction/ 
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Table 1. Verification statistics for the prediction of M-X 
and of X events for 1976-2003 





M-X 


X 


Total days 


10226 


10226 


Event days 


2600 


365 


(f) 


0.282 


0.040 


(x) 


0.254 


0.036 


Median / 


0.218 


0.017 


J 


0.251 


0.058 


Ox 


0.435 


0.186 


(f\x = l) 


0.508 


0.120 


(f\x = 0) 


0.204 


0.038 


Median f\x = 1 


0.565 


0.105 


Median f\x = 


0.116 


0.016 


Std. dev. f\x = 1 


0.216 


0.084 


Std. dev. f \x = 


0.212 


0.055 


MAE(/, x) 


0.277 


0.068 


MSE(/,x) 


0.138 


0.032 


Linear association 


0.528 


0.262 


SS(/,x) 


0.272 


0.066 



Table 2. Dependence of prediction of M-X and of X events 
on free parameters 
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0.280 
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2 x 10 


-6 
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yes 


0.286 


0.052 
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4 x 10 
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yes 


0.285 


0.041 
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4 x 10 


-6 
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no 


0.289 
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Observed: 


0.254 


0.036 



Table 3. Comparison of predictions of present method with 
NOAA predictions, 1987-2003 



Present NOAA 
method method 





M-X 
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M-X 


X 


(/> 


0.294 


0.040 


0.298 


0.064 


<x) 


0.262 


0.035 


0.262 


0.035 


U> = i> 


0.510 


0.122 


0.551 


0.244 


</|x = 0) 


0.217 


0.037 


0.208 


0.057 


MAE(/, x) 


0.289 


0.066 


0.271 


0.081 


MSE(/, x) 


0.143 


0.031 


0.139 


0.032 


SS(/,x) 


0.258 


0.078 


0.262 


-0.006 



